Active watermarks and watermark agents

ABSTRACT

Techniques for protecting the security of digital representations, and of analog forms made from them. The techniques include authentication techniques that can authenticate both a digital representation and an analog form produced from the digital representation, an active watermark that contains program code that may be executed when the watermark is read, and a watermark agent that reads watermarks and sends messages with information concerning the digital representations that contain the watermarks. The authentication techniques use semantic information to produce authentication information. Both the semantic information and the authentication information survive when an analog form is produced from the digital representation. In one embodiment, the semantic information is alphanumeric characters and the authentication information is either contained in a watermark embedded in the digital representation or expressed as a bar code. With the active watermark, the watermark includes program code. When a watermark reader reads the watermark, it may cause the program code to be executed. One application of active watermarks is making documents that send messages when they are operated on. A watermark agent may be either a permanent resident of a node in a network or of a device such as a copier or it may move from one network node to another. In the device or node, the watermark agent executes code which examines digital representations residing in the node or device for watermarked digital representations that are of interest to the watermark agent. The watermark agent then sends messages which report the results of its examination of the digital representations. If the watermarks are active, the agent and the active watermark may cooperate and the agent may cause some or all of the code that an active watermark contains to be executed.

CROSS REFERENCE TO RELATED PATENT APPLICATIONS

This application is a divisional of U.S. Ser. No. 09/070,597, which has the same inventor, title, and assignee as the present application and which is hereby incorporated into the present application by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates generally to digital representations of images and other information and more specifically to techniques for protecting the security of digital representations and of analog forms produced from them.

2. Description of the Prior Art

Nowadays, the easiest way to work with pictures or sounds is often to make digital representations of them. Once the digital representation is made, anyone with a computer can copy the digital representation without degradation, can manipulate it, and can send it virtually instantaneously to anywhere in the world. The Internet, finally, has made it possible for anyone to distribute any digital representation from anywhere in the world.

From the point of view of the owners of the digital representations, there is one problem with all of this: pirates, too, have computers, and they can use them to copy, manipulate, and distribute digital representations as easily as the legitimate owners and users can. If the owners of the original digital representations are to be properly compensated for making or publishing them, the digital representations must be protected from pirates. There are a number of different approaches that can be used:

the digital representation may be rendered unreadable except by its intended recipients; this is done with encryption techniques;

the digital representation may be marked to indicate its authenticity; this is done with digital signatures;

the digital representation may contain information from which it may be determined whether it has been tampered with in transit; this information is termed a digest and the digital signature often includes a digest;

the digital representation may contain a watermark, an invisible indication of ownership which cannot be removed from the digital representation and may even be detected in an analog copy made from the digital representation; and

the above techniques can be employed in systems that not only protect the digital representations, but also meter their use and/or detect illegal use.

For an example of a system that uses encryption to protect digital representations, see U.S. Pat. No. 5,646,999, Saito, Data Copyright Management Method, issued Jul. 8, 1997; for a general discussion of digital watermarking, see Jian Zhao, “Look, It's Not There”, in: BYTE Magazine, January, 1997. Detailed discussions of particular techniques for digital watermarking may be found in E. Koch and J. Zhao, “Towards Robust and Hidden Image Copyright Labeling”, in: Proc. Of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, Jun. 20-22, 1995, and in U.S. Pat. No. 5,710,834, Rhoads, Method and Apparatus Responsive to a Code Signal Conveyed through a Graphic Image, issued Jan. 20, 1998. For an example of a commercial watermarking system that uses the digital watermarking techniques disclosed in the Rhoads patent, see Digimarc Watermarking Guide, Digimarc Corporation, 1997, available in March, 1998 at http://www.digimarc.com.

FIG. 1 shows a prior-art system 101 which employs the above protection techniques. A number of digital representation clients 105, of which only one, digital representation client 105(j), is shown, are connected via a network 103 such as the Internet to a digital representation server 129 which receives digital representations from clients 105 and distributes them to clients 105. Server 129 includes a data storage device 133 which contains copied digital representations 135 for distribution and a management data base 139. Server 129 further includes a program for managing the digital representations 135, a program for reading and writing watermarks 109, a program for authenticating a digital representation and confirming that a digital representation is authentic 111, and a program for encrypting and decrypting digital representations 113. Programs 109, 111, and 113 together make up security programs 107.

Client 105 has its own versions of security programs 107; it further has editor/viewer program 115 which lets the user of client 105 edit and/or view digital representations that it receives via network 103 or that are stored in storage device 117. Storage device 117 as shown contains an original digital representation 119 which was made by a user of client 105 and a copied digital representation 121 that was received from DR Server 129. Of course, the user may have made original representation 119 by modifying a copied digital representation. Editor/viewer program 115, finally, permits the user to output digital representations to analog output devices 123. Included among these devices are a display 125, upon which an analog image 124 made from a digital representation may be displayed, and a printer 127 upon which an analog image 126 made from the digital representation may be printed. A loudspeaker may also be included in analog output devices 123. The output of the analog output device will be termed herein an analog form of the digital representation. For example, if the output device is a printer, the analog form is printed sheet 126; if it is a display device, it is display 124.

When client 105(j) wishes to receive a digital representation from server 129, it sends a message requesting the digital representation to server 129. The message includes at least an identification of the desired digital representation and an identification of the user. Manager 131 responds to the request by locating the digital representation in CDRs 135 and consulting management data base 139 to determine the conditions under which the digital representation may be distributed and the status of the user of client 105 as a customer. If the information in data base 139 indicates to manager 131 that the transaction should go forward, manager 131 sends client 105(j) a copy of the selected digital representation. In the course of sending the copy, manager 131 may use watermark reader/writer 109 to add a watermark to the digital representation, use authenticator/confirmer 111 to add authentication information, and use encrypter/decrypter 113 to encrypt the digital representation in such a fashion that it can only be decrypted in DR client 105(j).

When client 105(j) receives the digital representation, it decrypts it using program 113, confirms that the digital representation is authentic using program 111, and editor/viewer 115 may use program 109 to display the watermark. The user of client 105(j) may save the encrypted or unencrypted digital representation in storage 117. The user of client 105(j) may finally employ editor/viewer 115 to decode the digital representation and output the results of the decoding to an analog output device 123. Analog output device 123 may be a display device 125, a printer 127, or in the case of digital representations of audio, a loudspeaker.

It should be pointed out that when the digital representation is displayed or printed in analog form, the only remaining protection against copying is watermark 128, which cannot be perceived in the analog form by the human observer, but which can be detected by scanning the analog form and using a computer to find watermark 128. Watermark 128 thus provides a backup to encryption: if a digital representation is pirated, either because someone has broken the encryption, or more likely because someone with legitimate access to the digital representation has made illegitimate copies, the watermark at least makes it possible to determine the owner of the original digital representation and, given that evidence, to pursue the pirate for copyright infringement and/or violation of a confidentiality agreement.

If the user of client 105(j) wishes to send an original digital representation 119 to DR server 129 for distribution, editor/viewer 115 will send digital representation 119 to server 129. In so doing, editor/viewer 115 may use security programs 107 to watermark the digital representation, authenticate it, and encrypt it so that it can be decrypted only by DR Server 129. Manager 131 in DR server 129 will, when it receives digital representation 119, use security programs 107 to decrypt digital representation 119, confirm its authenticity, enter information about it in management data base 139, and store it in storage 133.

In the case of the Digimarc system referred to above, manager 131 also includes a World Wide Web spider, that is, a program that systematically follows World Wide Web links such as HTTP and FTP links and fetches the material pointed to by the links.

Manager program 131 uses watermark reading/writing program 109 to read any watermark, and if the watermark is known to management database 139, manager program 131 takes whatever action may be required, for example, determining whether the site from which the digital representation was obtained has the right to have it, and if not, notifying the owner of the digital representation.

While encryption, authentication, and watermarking have made it much easier for owners of digital representations to protect their property, problems still remain. One such problem is that the techniques presently used to authenticate digital documents do not work with analog forms; consequently, when the digital representation is output in analog form, the authentication is lost. Another is that present-day systems for managing digital representations are not flexible enough. A third is that watermark checking such as that done by the watermark spider described above is limited to digital representations available on the Internet. It is an object of the present invention to overcome the above problems and thereby to provide improved techniques for distributing digital representations.

SUMMARY OF THE INVENTION

In one aspect, the invention is an active watermark, that is, a watermark in which the information included in the watermark includes program code that can be executed when the watermark is read. What the program code does is of course completely arbitrary. For example, the code in the active watermark can send a message each time a particular operation is performed on the digital representation containing the active watermark. One use for such an active watermark is for billing: each time a digital representation with an active watermark is copied, for instance, the digital representation may send a message to a billing server. Another use is destroying the digital representation if a user attempts an operation for which the user has no privileges. In this aspect, the invention includes apparatus and methods for making and reading active watermarks. The methods and apparatus for making active watermarks may be employed anywhere present-day watermark makers are employed, and the methods and apparatus for reading active watermarks may be employed anywhere present-day watermark readers are employed.

In another aspect, the invention is a watermark agent which is located in a device upon which digital representations containing watermarks are resident. The watermark agent reads the watermark in a digital representation and performs actions ranging from sending a message to the user through sending a message to a monitoring agent, moving the digital representation, or changing its access rights to destroying the digital representation. Some watermark agents are mobile. A mobile watermark agent moves from node to node in a network. In each node, it examines the watermarks on digital representations stored in the node and sends messages reporting its findings to a monitoring agent located in the network. When a watermark agent encounters a digital representation with an active watermark, the watermark agent may execute the program code contained in the active watermark.

Other objects and advantages of the invention will be apparent to those skilled in the arts to which the invention pertains upon perusing the following Detailed Description and Drawing, wherein:

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a block diagram of a prior-art system for securely distributing digital representations;

FIG. 2 is a diagram of a first embodiment of an analog form that can be authenticated;

FIG. 3 is a diagram of a second embodiment of an analog form that can be authenticated;

FIG. 4 is a diagram of a system for adding authentication information to an analog form;

FIG. 5 is a diagram of a system for authenticating an analog form;

FIG. 6 is a diagram of a system for making an active watermark;

FIG. 7 is an example of code from an active watermark;

FIG. 8 is a diagram of a system for executing the code in an active watermark;

FIG. 9 is a diagram of a system for producing a watermark agent;

FIG. 10 is a diagram of a system for receiving a watermark agent;

FIG. 11 is a detailed diagram of access information 603; and

FIG. 12 is an example of code executed by a watermark agent.

The reference numbers in the drawings have at least three digits. The two rightmost digits are reference numbers within a figure; the digits to the left of those digits are the number of the figure in which the item identified by the reference number first appears. For example, an item with reference number 203 first appears in FIG. 2.

DETAILED DESCRIPTION

The following Detailed Description will first disclose a technique for authenticating digital representations that survives output of an analog form of the digital representation, will then disclose active watermarks, that is, watermarks that contain programs, and will finally disclose watermark agents, that is, programs which examine the digital watermarks on digital representations stored in a system and thereby locate digital representations that are being used improperly.

Authentication that is Preserved in Analog Forms: FIGS. 2-5

Digital representations are authenticated to make sure that they have not been altered in transit. Alteration can occur as a result of transmission errors that occur during the course of transmission from the source of the digital representation to its destination, as a result of errors that arise due to damage to the storage device being used to transport the digital representation, as a result of errors that arise in the course of writing the digital representation to the storage device or reading the digital representation from the storage device, or as a result of human intervention. A standard technique for authentication is to make a digest of the digital representation and send the digest to the destination together with the digital representation. At the destination, another digest is made from the digital representation as received and compared with the first. If they are the same, the digital representation has not changed. The digest is simply a value which is much shorter than the digital representation but is related to it such that any change in the digital representation will with very high probability result in a change to the digest.

Where human intervention is a serious concern, the digest is made using a one-way hash function, that is, a function that produces a digest from which it is extremely difficult or impossible to learn anything about the input that produced it. The digest may additionally be encrypted so that only the recipient of the digital representation can read it. A common technique is to use the encrypted digest as the digital signature for the digital representation, that is, not only to show that the digital representation has not been altered in transit, but also to show that it is from whom it purports to be. If the sender and the recipient have exchanged public keys, the sender can make the digital signature by encrypting the digest with the sender's private key. The recipient can use the sender's public key to decrypt the digest, and having done that, the recipient compares the digest with the digest made from the received digital representation. If they are not the same, either the digital representation has been altered or the digital representation is not from the person to whom the public key used to decrypt the digest belongs. For details on authentication, see Section 3.2 of Bruce Schneier, Applied Cryptography, John Wiley and Sons, 1994.
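The digest comparison described above can be illustrated with a minimal sketch. SHA-256 is an assumed choice of one-way hash function, and the function names and sample data are hypothetical. The step of encrypting the digest with the sender's private key is omitted, since it would require a cryptography library beyond what is shown here.

```python
import hashlib


def make_digest(data: bytes) -> str:
    """Return a short value derived from the whole digital representation.

    SHA-256 is an assumption; any one-way hash function would serve.
    """
    return hashlib.sha256(data).hexdigest()


def is_unaltered(received: bytes, sent_digest: str) -> bool:
    """Make a new digest at the destination and compare it with the first."""
    return make_digest(received) == sent_digest


# Hypothetical sample data: the sender makes the digest before transmission.
original = b"Pay to the order of J. Smith: $100.00"
sent_digest = make_digest(original)

# A single altered character changes the digest.
tampered = b"Pay to the order of J. Smith: $900.00"
```

Calling `is_unaltered(original, sent_digest)` succeeds, while `is_unaltered(tampered, sent_digest)` fails, which is the behavior the comparison at the destination relies on.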

The only problem with authentication is that it is based entirely on the digital representation. The information used to make the digest is lost when the digital representation is output in analog form. For example, if the digital representation is a document, there is no way of determining from a paper copy made from the digital representation whether the digital representation from which the paper copy was made is authentic or whether the paper copy is itself a true copy of the digital representation.

While digital watermarks survive and remain detectable when a digital representation is output in analog form, the authentication problem cannot be solved simply by embedding the digest or digital signature in the watermark. There are two reasons for this:

Watermarking changes the digital representation; consequently, if a digital representation is watermarked after the original digest is made, the watermarking invalidates the original digest, i.e., it is no longer comparable with the new digest that the recipient makes from the watermarked document.

More troublesome still, when a digital representation is output in analog form, so much information about the digital representation is lost that the digital representation cannot be reconstructed from the analog form. Thus, even if the original digest is still valid, there is no way of producing a comparable new digest from the analog form.

What is needed to overcome these problems is an authentication technique which uses information for authentication which is independent of the particular form of the digital representation and which will be included in the analog form when the analog form is output. As will be explained in more detail in the following, the first requirement is met by selecting semantic information from the digital representation and using only the semantic information to make the digest. The second requirement is met by incorporating the digest into the digital representation in a fashion such that it on the one hand does not affect the semantic information used to make the digest and on the other hand survives in the analog form. In the case of documents, an authentication technique which meets these requirements can be used not only to authenticate analog forms of documents that exist primarily in digital form, but also to authenticate documents that exist primarily or only in analog form, for example paper checks and identification cards.

Semantic Information

The semantic information in a digital representation is that portion of the information in the digital representation that must be present in the analog form made from the digital representation if the human who perceives the analog form is to consider it a copy of the original from which the digital representation was made. For example, the semantic information in a digital representation of an image of a document is the representations of the alphanumeric characters in the document, where alphanumeric is understood to include representations of any kind of written characters or punctuation marks, including those belonging to non-Latin alphabets, to syllabic writing systems, and to ideographic writing systems. Given the alphanumeric characters, the human recipient of the analog form can determine whether a document is a copy of the original, even though the characters may have different fonts and may have been formatted differently in the original document. There is analogous semantic information in digital representations of pictures and of audio information. In the case of pictures, it is the information that is required for the human that perceives the analog form to agree that the analog form is a copy (albeit a bad one) of the original picture, and the same is the case with audio information.

In the case of a document written in English, the semantic information in the document is the letters and punctuation of the document. If the document is in digital form, it may be represented either as a digital image or in a text representation language such as those used for word processing or printing. In the first case, optical character recognition (OCR) technology may be applied to the image to obtain the letters and punctuation; in the second case, the digital representation may be parsed for the codes that are used to represent the letters and punctuation in the text representation language. If the document is in analog form, it may be scanned to produce a digital image and the OCR technology applied to the digital image produced by scanning.
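As an illustration of the second case, the following sketch parses a document in a text representation language and keeps only the letters and punctuation. HTML is used purely as an assumed stand-in for such a language, and the class and function names are hypothetical. Runs of whitespace are collapsed so that formatting differences do not affect the result.

```python
from html.parser import HTMLParser


class SemanticsReader(HTMLParser):
    """Collects character data and ignores all formatting markup."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        # Called by the parser for text between tags.
        self.parts.append(data)


def read_semantics(markup: str) -> str:
    """Return the letters and punctuation of the document, without formatting."""
    reader = SemanticsReader()
    reader.feed(markup)
    reader.close()
    # Collapse whitespace so that layout differences do not matter.
    return " ".join("".join(reader.parts).split())


# Two differently formatted versions of the same (hypothetical) document.
version_a = "<p><b>Pay</b> to  J. Smith</p>"
version_b = "<h1>Pay to <i>J. Smith</i></h1>"
```

Both versions yield the same string of characters, which is the property that lets a digest made from the semantic information remain comparable across different formattings of the document.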

Using Semantic Information to Authenticate an Analog Form: FIGS. 2 and 3

Because the semantic information must be present in the analog form, it may be read from the analog form and used to compute a new digest. If the old digest was similarly made from the semantic information in the digital representation and the old digest is readable from the analog form, the new digest and the old digest can be compared as described in the discussion of authentication above to determine the authenticity of the analog form.

FIG. 2 shows one technique 201 for incorporating the old digest into an analog form 203. Analog form 203 of course includes semantic information 205; here, analog form 203 is a printed or faxed document and semantic information 205 is part or all of the alphanumeric characters on analog form 203. Sometime before analog form 203 was produced, semantic information 205 in the digital representation from which analog form 203 was produced was used to make semantic digest 207, which was incorporated into analog form 203 at a location which did not contain semantic information 205 when analog form 203 was printed. In some embodiments, semantic digest 207 may be added to the original digital representation; in others, it may be added just prior to production of the analog form. Any representation of semantic digest 207 which is detectable from analog form 203 may be employed; in technique 201, semantic digest 207 is a visible bar code. Of course, semantic digest 207 may include additional information; for example, it may be encrypted as described above and semantic digest 207 may include an identifier for the user whose public key is required to decrypt semantic digest 207. In such a case, semantic digest 207 is a digital signature that persists in the analog form.

With watermarking, the semantic digest can be invisibly added to the analog form. This is shown in FIG. 3. In technique 301, analog form 303 again includes semantic information 305. Prior to producing analog form 303, the semantic information in the digital representation from which analog form 303 is produced is used as described above to produce semantic digest 207; this time, however, semantic digest 207 is incorporated into watermark 307, which is added to the digital representation before the analog form is produced from the digital representation and which, like the bar code of FIG. 2, survives production of the analog form. A watermark reader can read watermark 307 from a digital image made by scanning analog form 303, and can thereby recover semantic digest 207 from watermark 307. As was the case with the visible semantic digest, the semantic digest in watermark 307 may be encrypted and may also function as a digital signature.

Adding a Semantic Digest to an Analog Form: FIG. 4

FIG. 4 shows a system 401 for adding a semantic digest to an analog form 203. The process begins with digital representation 403, whose contents include semantic information 205. Digital representation 403 is received by semantics reader 405, which reads semantic information 205 from digital representation 403. Semantics reader 405's operation will depend on the form of the semantic information. For example, if digital representation 403 represents a document, the form of the semantic information will depend on how the document is represented. If it is represented as a bit-map image, the semantic information will be images of alphanumeric characters in the bit map; if it is represented using one of the many representations of documents that express alphanumeric characters as codes, the semantic information will be the codes for the alphanumeric characters. In the first case, semantics reader 405 will be an optical character reading (OCR) device; in the second, it will simply parse the document representation looking for character codes.

In any case, at the end of the process, semantics reader 405 will have extracted some form of semantic information, for example the ASCII codes corresponding to the alphanumeric characters, from representation 403. This digital information is then provided to digest maker 409, which uses it to make semantic digest 411 in any of many known ways. Depending on the kind of document the semantic digest is made from and its intended use, the semantic digest may have a form which requires an exact match with the new digest or may have a form which permits a “fuzzy” match. Digital representation 403 and semantic digest 411 are then provided to digest incorporator 413, which incorporates a representation 207 of digest 411 into the digital representation used to produce analog form 203. As indicated above, the representation must be incorporated in such a way that it does not affect semantic information 205. Incorporator 413 then outputs the representation it produces to analog form producer 415, which produces analog form 203 in the usual fashion. Analog form 203 of course includes semantic information 205 and representation 207 of semantic digest 411. Here, the bar code is used, but representation 207 could equally be part of a watermark, as in analog form 303. Components 405, 409, and 413 may be implemented as programs executed on a digital computer system; analog form producer 415 may be any device which can output an analog form.

Authenticating an Analog Form that has a Semantic Digest

FIG. 5 shows a system 501 for authenticating an analog form 503 that has a semantic digest 207. Analog form 503 is first provided to semantic digest reader 505 and to semantics reader 507. Semantic digest reader 505 reads semantic digest 207; if semantic digest 207 is a bar code, semantic digest reader 505 is a bar code reader; if semantic digest 207 is included in a digital watermark, semantic digest reader 505 is a digital watermark reader which receives its input from a scanner. If semantic digest 207 must be decrypted, semantic digest reader 505 will do that as well. In some cases, that may require sending the encrypted semantic digest to a remote location that has the proper key.

Semantics reader 507 reads semantic information 305. If analog form 503 is a document, semantics reader 507 is a scanner which provides its output to OCR software. With other images, the scanner provides its output to whatever image analysis software is required to analyze the features of the image that make up semantic information 305. If analog form 503 is audio, the audio will be input to audio analysis software. Once the semantics information has been reduced to semantics data 509, it is provided to semantic digest maker 511, which makes a new semantic digest 513 out of the information. To do so, it uses the same technique that was used to make old semantic digest 515. Comparator 517 then compares old semantic digest 515 with new semantic digest 513; if the digests match, comparison result 519 indicates that analog form 503 is authentic; if they do not, result 519 indicates that it is not authentic. What “match” means in this context will be explained in more detail below.
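The flow through digest maker 511 and comparator 517 can be sketched as follows for the simplest case, in which an exact match is required. This is an illustration under stated assumptions rather than the implementation of system 501: the semantics data is taken to be a plain character string already produced by OCR, SHA-256 is an assumed digest function, and all names and sample strings are hypothetical.

```python
import hashlib


def make_semantic_digest(semantics_data: str) -> str:
    """Digest made from the semantics data only, not from the raw file bytes."""
    return hashlib.sha256(semantics_data.encode("utf-8")).hexdigest()


def compare(old_digest: str, new_digest: str) -> bool:
    """Exact-match comparison: the form is authentic only if the digests are equal."""
    return old_digest == new_digest


# Old digest, as it would be recovered from the bar code or watermark.
old_digest = make_semantic_digest("Pay to J. Smith 100 dollars")

# Semantics data as read back from the analog form by OCR (hypothetical).
clean_read = "Pay to J. Smith 100 dollars"
misread = "Pay to J. Smith l00 dollars"  # the digit 1 read as the letter l

result_clean = compare(old_digest, make_semantic_digest(clean_read))
result_misread = compare(old_digest, make_semantic_digest(misread))
```

With this kind of digest a single misread character makes the comparison fail even though the analog form is genuine.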

“Matching” Semantic Digests

With the digests that are normally used to authenticate digital representations, exact matches between the old and new digests are required. One reason for this is that in most digital contexts, “approximately correct” data is useless; another is that the one-way hashes normally used for digests are “cryptographic”, that is, the value of the digest reveals nothing about the value from which it was made by the hash function, or in more practical terms, a change of a single bit in the digital representation may result in a large change in the value produced by the hash function. Since that is the case, the only comparison that can be made between digests is one of equality.

In the context of authenticating analog forms, the requirement that digests be equal causes difficulties. The reason for this is that reading semantic information from an analog form is an error-prone operation. For example, after many years of effort, OCR technology has gotten to the point where it can in general recognize characters with 98% accuracy when it begins with a clean copy of a document that is simply formatted and uses a reasonable type font. Such an error rate is perfectly adequate for many purposes; but for semantic information of any size, a new digest will almost never be equal to the old digest when the new digest is made from semantics data that is 98% the same as the semantics data that was used to make the old semantic digest. On the other hand, if the semantics data obtained from the analog form is 98% the same as the semantics data obtained from the digital representation, there is a very high probability that the analog form is in fact an authentic copy of the digital representation.
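A short calculation shows why equal digests become unlikely as the semantic information grows. The sketch assumes that the 98% figure is a per-character accuracy and that recognition errors are independent; both are simplifying assumptions made only for illustration, and the function name is hypothetical.

```python
def prob_all_correct(n_chars: int, accuracy: float = 0.98) -> float:
    """Probability that every one of n_chars characters is read correctly.

    Assumes independent errors with the given per-character accuracy.
    An exact digest match requires that no character at all is misread.
    """
    return accuracy ** n_chars


# Chance of an exact match for semantic information of various sizes.
for n in (10, 100, 1000):
    print(n, prob_all_correct(n))
```

Under these assumptions a field of ten characters is usually read without error, a hundred characters only occasionally, and a thousand characters almost never.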

Precise Matches

Of course, if the semantic information is limited in size and tightly constrained, it may be possible to require that the digests be exactly equal. For example, many errors can be eliminated if what is being read is specific fields, for example in a check or identification card, and the OCR equipment is programmed to take the nature of the field's contents into account. For example, if a field contains only numeric characters, the OCR equipment can be programmed to treat the letters o and O as the number 0 and the letters l, i, or I as the number 1. Moreover, if a match fails and the semantic information contains a character that is easily confused by the OCR equipment, the character may be replaced by one of the characters with which it is confused, the digest may be recomputed, and the match may again be attempted with the recomputed digest.
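The two measures just described can be sketched as follows. The numeric-field substitutions are those named in the text; the table of confusable characters, the use of SHA-256, and all function names are assumptions introduced only for illustration. The retry loop substitutes one character at a time, which is one simple reading of the procedure.

```python
import hashlib
from typing import Optional

# Letters that OCR may produce in a field known to contain only digits.
NUMERIC_FIXES = str.maketrans({"o": "0", "O": "0", "l": "1", "i": "1", "I": "1"})

# Hypothetical table of easily confused characters for non-numeric fields.
CONFUSABLE = {"5": "S", "S": "5", "8": "B", "B": "8"}


def normalize_numeric_field(ocr_text: str) -> str:
    """Treat o and O as 0, and l, i, and I as 1, in a numeric-only field."""
    return ocr_text.translate(NUMERIC_FIXES)


def digest(text: str) -> str:
    """Exact-match digest; SHA-256 is an assumed choice."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def match_with_retries(ocr_text: str, old_digest: str) -> Optional[str]:
    """Try the text as read, then retry with one confusable character replaced.

    Returns the text whose digest equals old_digest, or None if none does.
    """
    if digest(ocr_text) == old_digest:
        return ocr_text
    for i, ch in enumerate(ocr_text):
        alternative = CONFUSABLE.get(ch)
        if alternative is None:
            continue
        candidate = ocr_text[:i] + alternative + ocr_text[i + 1:]
        if digest(candidate) == old_digest:
            return candidate
    return None
```

For example, a numeric field read as `1O0l` normalizes to `1001`, and a field whose true content is `AB58` but which was read as `AB5B` is recovered on the retry that replaces the final `B` with `8`.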

Fuzzy Matches

Where the semantic information is not tightly constrained, the digests must be made in such a fashion that closely-similar semantic information produces closely-similar digests. When that is the case, matching becomes a matter of determining whether the difference between the digests is within a threshold value, not of determining whether they are equal. A paper by Marc Schneider and Shih-Fu Chang, “A Robust Content Based Digital Signature for Image Authentication”, in: Proceedings of the 1996 International Conference on Image Processing, presents some techniques for dealing with related difficulties in the area of digital imaging. There, the problems are not caused by loss of information when a digital representation is used to make an analog form and by mistakes made in reading analog forms, but rather by “lossy” compression of images, that is, compression using techniques which result in the loss of information. Because the lost information is missing from the compressed digital representation, a digest made using cryptographic techniques from the compressed digital representation will not be equal to one made from the digital representation prior to compression, even though the compressed and uncompressed representations contain the same semantic information. Speaking generally, the techniques presented in the Schneider paper deal with this problem by calculating the digest value from characteristics of the image that are not affected by compression, such as the spatial location of its features. Where there are sequences of images, the digest value is calculated using the order of the images in the sequences.

Analogous approaches may be used to compute the semantic digest used to authenticate an analog form. For example, a semantic digest for a document can be computed like this:

1. Set the current length of a digest string that will hold the semantic digest to “0”;

2. Starting with the first alphanumeric character in the document, perform the following steps until there are no more characters in the document:

    a. Select a next group of characters;
    b. For the selected group,
        i. replace characters in the group such as O, 0, o; I, i, l, 1; or c, e that cause large numbers of OCR errors with a “don't care” character;
        ii. make a hash value from the characters in the group;
        iii. append the hash value to the semantic digest string;
    c. return to step (a).

3. When there are no more characters in the document, make the semantic digest from the digest string.

When computed in this fashion, the sequence of values in the semantic digest string reflects the order of the characters in each of the sequences used to compute the digest. If the sequence of values in the new semantic digest that is computed from the analog form has a high percentage of matches with the sequence of values in the old semantic digest, there is a high probability that the documents contain the same semantic information.
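The digest computation and the percentage-of-matches comparison described above can be sketched as follows. The group size, the “don't care” symbol, the truncated SHA-256 group hash, and the choice to keep the digest as a list of per-group values rather than one string are all assumptions made for illustration.

```python
import hashlib

DONT_CARE = "?"                  # assumed "don't care" symbol
CONFUSABLE = set("O0oIil1ce")    # error-prone characters named in step 2(b)(i)
GROUP_SIZE = 8                   # assumed number of characters per group


def semantic_digest(text: str, group_size: int = GROUP_SIZE) -> list:
    """Return one short hash per group of alphanumeric characters."""
    chars = [c for c in text if c.isalnum()]
    values = []
    for start in range(0, len(chars), group_size):
        group = chars[start:start + group_size]
        masked = "".join(DONT_CARE if c in CONFUSABLE else c for c in group)
        values.append(hashlib.sha256(masked.encode("utf-8")).hexdigest()[:16])
    return values


def match_fraction(old: list, new: list) -> float:
    """Fraction of positions at which the two digest sequences agree."""
    if not old and not new:
        return 1.0
    hits = sum(a == b for a, b in zip(old, new))
    return hits / max(len(old), len(new))


original = "Pay 100 dollars to Alice Cooper on 1 May"
ocr_read = "Pay l00 dollars to Alice Cooper on I May"   # l/1 and I/1 confusions
tampered = "Pay 900 dollars to Alice Cooper on 1 May"

print(match_fraction(semantic_digest(original), semantic_digest(ocr_read)))
print(match_fraction(semantic_digest(original), semantic_digest(tampered)))
```

A verifier would accept the analog form when the fraction exceeds some threshold. Two limitations of this particular sketch are worth noting: an OCR error that inserts or drops a character shifts every later group, and a change between two masked characters (for example 0 to 1) is invisible to the digest.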

Applications of Authentication with Analog Forms

One area of application is authenticating written documents generally. To the extent that the document is of any length and the digest is computed from a significant amount of the contents, the digest will have to be computed in a fashion which allows fuzzy matching. If the digest is computed from closely-constrained fields of the document, exact matching may be employed.

Another area of application is authenticating financial documents such as electronic cash, electronic checks, and bank cards. Here, the fields from which the digest is computed are tightly constrained and an exact match may be required for security. In all of these applications, the digest or even the semantic information itself would be encrypted as described above to produce a digital signature.

Universal Paper & Digital Cash

Digital cash is at present a purely electronic medium of payment. A given item of digital cash consists of a unique serial number and a digital signature. Authentication using semantic information permits digital cash to be printed as digital paper cash. The paper cash is printed from an electronic image which has a background image, a serial number, and a money amount. The serial number and the money amount are the semantic information. The serial number and the money amount are used to make a digital signature and the digital signature is embedded as an electronic watermark into the background image. The paper cash can be printed by any machine which needs to dispense money. Thus, an ATM can dispense digital paper cash instead of paper money. Similarly, a vending machine can make change with digital paper cash and a merchant can do the same. The digital paper cash can be used in the same way as paper money. When a merchant (or a vending machine) receives the digital paper cash in payment, he or she uses a special scanner (including OCR technology and a watermark reader) to detect the watermark (i.e. the serial number and money amount) from the printed image, and sends them to the bank for verification in the same fashion as is presently done with credit cards.

Digital Checks

Digital checks can be made using the same techniques as are used for digital paper cash. The digital check includes a background image, an identifier for the bank account, an amount to be paid, and the name of the payer. The payer's private key is used to make a digital signature from at least the identifier for the bank account and the amount to be paid, and the digital signature is embedded as an electronic watermark in the background image. Writing a digital check is a three-step process: enter the amount, produce the digital signature from the bank account number and the amount using the payer's private key, and embed the digital signature into the background image. The bank verifies the check by detecting the watermark from the digital check, decrypting the digital signature with the payer's public key, and comparing the bank account number and the amount from the image with the bank account number and the amount on the face of the check. A digital check can be used in either electronic form or paper form. In the latter case, a scanner (including OCR technology and a watermark reader) is needed to read the watermark from the paper check.
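The signing and verification steps can be sketched with textbook RSA. Everything specific here is an assumption made for illustration: the tiny key built from two Mersenne primes (far too small to be secure), the SHA-256 digest, and the function names. Embedding the signature in the background image is omitted, and `pow(E, -1, PHI)` requires Python 3.8 or later.

```python
import hashlib

# Toy key pair for illustration only; real keys are vastly larger.
P = 2**61 - 1                 # Mersenne prime
Q = 2**31 - 1                 # Mersenne prime
N = P * Q
PHI = (P - 1) * (Q - 1)
E = 65537                     # public exponent
D = pow(E, -1, PHI)           # private exponent (modular inverse of E)


def digest_int(account: str, amount: str) -> int:
    """Digest of the semantic information, reduced to fit the toy modulus."""
    data = f"{account}|{amount}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign_check(account: str, amount: str) -> int:
    """Payer: make the signature with the private key."""
    return pow(digest_int(account, amount), D, N)


def verify_check(account: str, amount: str, signature: int) -> bool:
    """Bank: recover the digest with the public key and compare it with the
    digest of what appears on the face of the check."""
    return pow(signature, E, N) == digest_int(account, amount)


signature = sign_check("12345678", "250.00")
print(verify_check("12345678", "250.00", signature))   # face matches signature
print(verify_check("12345678", "950.00", signature))   # amount was altered
```

In the scheme described above, `signature` would be the value carried by the watermark, and the account number and amount passed to `verify_check` would be read from the face of the check.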

Authentication of Identification Cards

The techniques described above for authenticating digital paper cash or digital checks can be used with identification cards, including bankcards. The card number or other identification information appears on the face of the card, is encrypted into a digital signature, and is embedded as a digital watermark in the background image of the bankcard. The encryption can be done with the private key of the institution that issues the card. The merchant uses a scanner to detect the digital signature (i.e. card number or other ID) from the card, and compares the signature with the authentication stored inside the card. This technique can of course be combined with conventional authentication techniques such as the holographic logo.

Active Watermarks: FIGS. 6-8

Heretofore, digital watermarks have been nothing more than labels. They have typically contained information such as identifiers for the owner and creator of the digital representation and access control information, for example, whether the digital representation may be copied or changed. Any kind of information can, however, be placed in a digital watermark. If the information in the watermark describes an action to be taken, the watermark becomes active, and the digital representation that contains the active watermark becomes active as well. This is the reverse of the usual practice of encapsulating a digital representation in a program, as is done for example with Microsoft Active Documents. Since digital watermarks are used in digital systems, the simplest way to make a watermark active is to include program code in it which can be executed by the computer system upon which the digital representation is currently resident. From the point of view of function, the code may be in any language for which the computer system can execute code. Practically, however, the code is best written in a language such as Java™ or Perl for which most modern computer systems have interpreters.

FIG. 6 is an overview of a system 601 for making an active watermark 619. The watermark is made from watermark information 603, which contains owner information 605, access information 607, and owner-defined information 609 as before, but additionally contains code 611. Code 611 may be standard code for a given class of digital representations, or it may be defined specifically for a given digital representation. Code 611 may of course also use the other information in watermark information 603 as data. Watermark information 603 and digital representation 613 are input into watermark maker 615, which outputs digital representation 617, which is digital representation 613 modified to include watermark 619 made from watermark information 603. Since watermark information 603 includes code 611, watermark 619 is an active watermark.

FIG. 11 shows a preferred embodiment of access information 607. It includes fields as follows:

an 8-bit permission (P) field which indicates the kind of access the user may have: among the kinds are access which permits display, access which permits storing a local copy, and access which permits printing;

a four-bit sensitivity field whose value indicates the degree of sensitivity of the contents of the digital representation;

a 32-bit allowed location field which contains the IP address at which the digital representation is permitted to be located;

a 32-bit allowed period field which contains a period of time for which use of the digital representation has been permitted, and
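One way the four fields named above might be laid out is sketched here. The field widths come from the text; the bit order, the individual permission bits, and the treatment of the period as an opaque 32-bit integer are assumptions, since the text does not specify them.

```python
import ipaddress

# Assumed bit assignments within the 8-bit permission field.
PERMIT_DISPLAY = 0x01
PERMIT_STORE = 0x02
PERMIT_PRINT = 0x04


def pack_access_info(permission: int, sensitivity: int,
                     allowed_ip: str, allowed_period: int) -> int:
    """Pack the fields into one 76-bit integer (8 + 4 + 32 + 32 bits)."""
    if not 0 <= permission < 2**8:
        raise ValueError("permission must fit in 8 bits")
    if not 0 <= sensitivity < 2**4:
        raise ValueError("sensitivity must fit in 4 bits")
    if not 0 <= allowed_period < 2**32:
        raise ValueError("allowed period must fit in 32 bits")
    location = int(ipaddress.IPv4Address(allowed_ip))
    return (permission << 68) | (sensitivity << 64) | (location << 32) | allowed_period


def unpack_access_info(packed: int) -> tuple:
    """Recover (permission, sensitivity, allowed_ip, allowed_period)."""
    permission = (packed >> 68) & 0xFF
    sensitivity = (packed >> 64) & 0xF
    location = str(ipaddress.IPv4Address((packed >> 32) & 0xFFFFFFFF))
    return permission, sensitivity, location, packed & 0xFFFFFFFF


packed = pack_access_info(PERMIT_DISPLAY | PERMIT_PRINT, 9, "192.0.2.17", 86400)
print(unpack_access_info(packed))
```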

FIG. 7 is an example of a program which might be found in code 611. Program 701 is written in the Java programming language. It is then compiled into Java bytecodes which are interpreted by a Java interpreter. These bytecodes are included in the digital watermark. When program 701 is executed, a message indicating that digital representation 617 containing the active watermark has been displayed is sent via the Internet to a system that has been set up to monitor the display of digital representation 617, perhaps for the purpose of computing license fees. Line 703 of the code sets up a socket s by means of which a datagram may be sent to the monitoring system. Line 709 of the code finds the current Internet address a of the monitoring system which is specified at 705 by the name syscop.crg.edu. Line 715 makes a new datagram packet for the message; it includes the message content, XYZ Displayed, and the Internet address a. Line 719, finally, uses the send operation associated with the socket s to send the message, which the Internet will deliver to the destination specified by a.
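The same sequence of steps (open a datagram socket, resolve the monitor's name, build the message, send it) can be sketched in Python. This is not program 701 itself, which is in Java and appears only in FIG. 7; the function name and the port number are assumptions, as the text does not give a port.

```python
import socket


def notify_displayed(message: bytes, host: str, port: int) -> None:
    """Send one UDP datagram carrying the message to the monitoring system."""
    address = socket.gethostbyname(host)          # resolve the monitor's name
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(message, (address, port))        # fire-and-forget datagram


# Hypothetical call corresponding to the figure; port 9999 is an assumption:
# notify_displayed(b"XYZ Displayed", "syscop.crg.edu", 9999)
```

Because datagrams are unacknowledged, a monitor that relies on such messages for billing would have to tolerate occasional loss.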

FIG. 8 shows a system 801 for executing the code in active watermark 619. Digital representation 617 containing active watermark 619 is input to watermark reader 803, which extracts watermark information 603 from active watermark 619. Info 603 includes code 611, which watermark reader 803 provides to code interpreter 805. Code interpreter 805 interprets code 611 to provide instructions which are executable by the computer system upon which code interpreter 805 is running. In some embodiments, code interpreter 805 is an interpreter provided by the computer system for a standard language such as Java; in others, interpreter 805 may be provided as a component of watermark reader 803. In such embodiments, code 611 may be written in a language specifically designed for active watermarks.

An active watermark 619 can cause the computer system in which the active watermark is read to perform any action which can be described by the code contained in the active watermark. The only limitations are those imposed by the fact that the code is part of a watermark. One of these limitations is code size: code contained in a watermark must necessarily be relatively short; this limitation can be alleviated by compressing the code using a “non-lossy” compression technique, that is, one which does not result in the loss of information. Another of the limitations is that damage to the watermark may result in damage to the code; consequently, active watermarks may not work well in situations where the digital representation 617 is involved in “lossy” manipulations, i.e., manipulations that cause loss of information in digital representation 617. Examples of such lossy manipulations are editing the digital representation, lossy translation of the digital representation from one format into another, lossy compression of the digital representation, and producing a new digital representation from an analog form made from an old digital representation (as would be the case, for example, if the code were obtained by reading the watermark from a paper copy of a document).
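The effect of non-lossy compression on code size can be illustrated with a brief sketch. The use of zlib and the sample payload are assumptions; the text does not name a compression technique.

```python
import zlib


def pack_code(code: bytes) -> bytes:
    """Compress watermark code losslessly before embedding (zlib assumed)."""
    return zlib.compress(code, 9)


def unpack_code(payload: bytes) -> bytes:
    """Recover the code exactly as it was before compression."""
    return zlib.decompress(payload)


# Hypothetical, deliberately repetitive payload standing in for code 611.
code = b"send('XYZ Displayed')\n" * 20
packed = pack_code(code)

print(len(code), "->", len(packed), "bytes")
print(unpack_code(packed) == code)
```

The saving depends entirely on how redundant the code is; short or already compact code may shrink very little.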

Of course, techniques like those discussed above with reference to exact matches of digests can be applied to recover code from a damaged watermark or from an analog form, and to the extent that such techniques are successful, active watermarks can be used even where lossy manipulations have taken place. For example, the watermark of an analog form may contain not only authentication information, but also code. If a copy machine contained a watermark reader and an interpreter for the code used in the active watermark, the active watermark could be used, for instance, to prevent the copy machine from copying the analog form.

Among the things that can be done with active watermarks are the following:

Customizing the manner in which the digital representation containing the watermark is treated. Code 611 may differ for classes of digital representations, or may even be particular to a single digital representation;

Having a digital representation send a message whenever it is displayed, copied, printed, or edited; for example, whenever a document with an active watermark stored on a Web server is downloaded from the server, the active watermark can cause a message containing billing information to be sent to a billing server;

Having the digital representation obtain locally-available information, which will then govern the behavior and usage of the digital representation;

Having a digital representation take protective action when a user tries to do something with it that is not permitted by access information 607; the protective action can range from a warning through sending a message or blocking the intended action to destroying the digital representation that contains the watermark.

Watermark Agents

Digital representations pose special problems for their owners because, like all digital data, they can be easily copied and distributed across a network. These properties of digital data, however, also make it possible to automate monitoring of the distribution and use of watermarked digital representations. One way to do this is the watermark spider. As mentioned in the Description of the Prior Art, the watermark spider follows URLs to Web pages, which it retrieves and inspects for watermarks. If it finds one that is of interest, it reports its findings to a monitoring program. There are two problems with the watermark spider: the first is that it is limited to digital representations which are accessible by URLs that are available to the public. Thus, the watermark spider would not be able to locate a copy of a digital representation on a WWW client, as opposed to on a WWW server. The other problem is that the spider must fetch each digital representation to be examined across the network. Since digital representations are often large, the need to do this adds substantially to the volume of network traffic.

Both of these problems can be solved by means of a network watermark agent, that is, a watermark monitor which uses the network to move from system to system where digital representations of interest might be stored. At each system, the watermark agent examines the system's file system for digital representations that have watermarks of interest. If the watermark agent finds such a watermark, it may send a message with its findings via the network to a monitoring program. The watermark agent is thus able to monitor digital representations that are not available via public URLs and uses network bandwidth only relatively rarely and only to send messages that are small in comparison with digital representations. In the following, the creation of a watermark agent and its behavior in a system will both be explained in detail.

Creating a Watermark Agent: FIG. 9

FIG. 9 shows a watermark monitoring system 901 which creates and dispatches a watermark agent 925 across a network 103 and responds to messages from the watermark agent. Watermark agent 925 is a program which is able to send itself from one node to another in network 103. In each node, it searches for watermarked documents and sends messages 935 containing its findings to monitoring system 901, where message handler 920 deals with the message, often by adding information to management data base 903.

Continuing in more detail, agent 925 has two main parts: agent code 927, which is executed when agent 925 reaches a node, and agent data 929, which contains information used by agent 925 in executing the code and in moving to the next node. At a minimum, agent code 927 will include code which searches the node for files that may contain watermarks, code that makes and sends any necessary messages to monitoring system 901, code that clones agent 925, and code that sends the clone on to the next node. As with the code in active watermarks, code 927 may be written in any language which can be executed in a node; either standard languages such as Java or a specialized watermark agent language may be used.

FIG. 12 provides an example, written in the Java language, of code 1201 that a watermark agent 925 might execute. Code 1201 searches the file system of the network node at which agent 925 is presently located for image files and checks each image file for a watermark. If it finds a watermark, it performs the action required by the watermark and the node, and it makes a message containing a list of the actions it performed.

Continuing in more detail, code 1201 has two main sections, initialization 1203 and checking loop 1213. In initialization 1203, the first step is to instantiate a file filter to filter the files in the node's file system (1205). Then a function of the filter which locates image files is used to make a list, filenames, of the names of the image files in the file system (1207). Thereupon, information about the environment of the node that the agent needs to check watermarks is retrieved and placed in a variable env (1209); finally, a data structure called results is created to hold the results of the watermark checks (1211).

In loop 1213, each file in filenames is examined in turn for a watermark (1215); if one is found, the actions indicated at 1217 are performed: first, the contents of the watermark are compared with the environment information to obtain a result called match (1219). Then match is passed to a function which takes an action as determined by the value of match and returns a value result which represents the result of the action (1221); finally, result is added to the data structure results (1223). Then, at 1225, results is returned. Depending on how the watermark agent is being used, results can then be sent in a message to monitoring system 901.
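The structure of initialization 1203 and loop 1213 can be sketched in Python. This is not code 1201 itself, which is in Java and appears only in FIG. 12. The names filenames, env, results, match, and result follow the description above; the watermark reader, the action function, and the file contents are hypothetical stand-ins.

```python
import fnmatch
import os
import tempfile


def read_watermark(path):
    """Hypothetical reader: treats a first line b'WM:<location>' as a watermark."""
    with open(path, "rb") as f:
        first = f.readline().strip()
    if not first.startswith(b"WM:"):
        return None
    return {"allowed_location": first[3:].decode("ascii")}


def act(filename, match):
    """Hypothetical action: just record whether the check passed."""
    return (os.path.basename(filename), "ok" if match else "violation")


def check_node(root, pattern, env):
    # Initialization: filter the file system for image files.
    filenames = sorted(
        os.path.join(directory, name)
        for directory, _, names in os.walk(root)
        for name in names
        if fnmatch.fnmatch(name, pattern)
    )
    results = []
    # Checking loop: read each watermark and compare it with the environment.
    for filename in filenames:
        watermark = read_watermark(filename)
        if watermark is None:
            continue
        match = watermark["allowed_location"] == env["location"]
        result = act(filename, match)
        results.append(result)
    return results


# Small demonstration on a temporary directory standing in for a node.
with tempfile.TemporaryDirectory() as root:
    samples = [("a.bmp", b"WM:10.0.0.1\n"), ("b.bmp", b"WM:10.0.0.2\n"),
               ("c.bmp", b"no mark\n"), ("d.txt", b"WM:10.0.0.2\n")]
    for name, body in samples:
        with open(os.path.join(root, name), "wb") as f:
            f.write(body)
    demo_results = check_node(root, "*.bmp", {"location": "10.0.0.1"})

print(demo_results)
```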

Continuing in more detail with agent data 929, agent data 929 includes a map 931, digital representation description 933, keys 934, and parameters 921. Map 931 is a list of addresses in network 103. Each address specifies an entity in network 103 that can provide an environment in which agent 925 can operate. The address may for example be an E-mail address or an IP address. Digital representation description 933 may be any information that describes the digital representations the agent is looking for. There may be a filter for the file names and there may also be identification information from the watermark. For example, if the files to be examined are .bmp files, the filter might specify *.bmp, indicating that all files with the .bmp suffix are to be examined. If a watermark key is needed to read the watermark, keys 934 will contain that key and if the messages sent to monitor system 901 are to be encrypted, keys 934 will contain the key to be used in encrypting the messages. Any available technique may be used to keep the keys secure. In a preferred embodiment, the parameters include

the email address for messages sent by the agent;

whether to report on files to which agent 925 had no access;

date of last monitoring and whether to check only files updated since that date;

whether to execute code 611 in an active watermark 619; and

termination conditions for agent 925.

Agent 925 is produced by agent generator 923, which can be implemented as a component of digital representation manager 131. Agent generator 923 makes agent 925 from information in management data base 903 and agent parameters 921, which here are shown being provided interactively by a user of monitoring system 901, but may also be stored in management data base 903. The information in management data base 903 includes agent template 905(i), which is one of a number of templates that are used together with parameters 921 and other information in management data base 903 to generate agent code 927 for different kinds of agents 925. Suspicious sites 907 is a list of network locations which might be worth examining. One source of information for sites that should be on the list of suspicious sites 907 is of course messages received from previously-dispatched agents. Network information 909 is information about the network. Suspicious sites 907 and network information 909 are used together to make map 931 in agent 925. Digital representation information 911, finally, contains information about the digital representations that the agent will be looking for. The information is used to make DR Description 933. Information 911(i) for a given digital representation or group of digital representations may include a watermark key 913 for the digital representation's watermark and information from the watermark including owner ID 915, user ID 917, and permitted use information 919. User ID 917 is an identification for the user to whom the digital representation was downloaded. Once agent 925 has thus been created by monitor system 901, agent 925 clones itself, makes the clone into the kind of message required for the first entity specified in map 931, and sends the message to the first entity. Thereupon, agent 925 terminates itself.

Watermark Agents in Network Nodes: FIG. 10

FIG. 10 shows those components of a network node 1001 which are involved in the monitoring of the node by a watermark agent 925. The components include:

agent engine 1003, which provides the environment in which agent 925 executes its code and which is the entity to which the message containing agent 925 is addressed;

file storage 1031, which contains the digital representations 1023 that are of interest to agent 925;

file system 1029, which makes the digital representations 1023 accessible as files;

watermark reader 1019, which reads the watermarks; and

code interpreter 1011, which interprets code in agent 925 and may also interpret the code in active watermarks, if that code is written in the same language as the code used in agent 925.

SC 1033 is an optional secure coprocessor whose functions will be explained in more detail in the discussion of security. Operation of components 1001 is as follows: When the message containing agent 925 arrives in agent engine 1003 from network 103, agent engine 1003 extracts agent 925 from the message and, at a convenient time, uses code interpreter 1011 to begin executing its code. What the code does is of course arbitrary. Typically, it will do the following:

1. Send a message to system 901 indicating its arrival in the node;

2. Obtain the file filter from DRDESC 933 and give it to spider 1009 to make a list of files that match the filter;

3. For each file on the list, do the following:

    a. use spider 1009 to get the file ID for the file;
    b. give file ID 1021 to watermark reader 1019, which uses the watermark key from keys 934 to read the watermark, if any;
    c. receive the watermark content 1017;
    d. process watermark content 1017 as specified in code 927. Actions might include sending a message to system 901 or passing the code and data 1015 from an active watermark to code interpreter 1011 for execution and receiving data 1013 in return;

4. When all of the files have been processed,

    a. send a message to monitor system 901 with summary information about the results of the visit and the next node to be visited;
    b. make a clone of agent 925 and send the clone to the next address specified in map 931; and
    c. terminate agent 925.

As previously indicated, what a watermark agent can do is essentially arbitrary. If the documents being dealt with by the watermark agent have active watermarks, there are any number of ways of dividing the work of processing documents of interest between the code in the watermark agent and the code in the active watermark. For instance, in the example above, step 3(d) could consist simply of executing the code in the document's active watermark.

The actions performed in step 3(d) will typically be performed when the information in the watermark does not match the time or place where agent 925 found the file or the time and/or place are inappropriate for the file's access privileges. The action may be one of a pre-defined set specified by parameters in parameters 921, it may be one defined by agent 925's code 927, or it may be one defined by an active watermark. Among the pre-defined actions are:

1. Destroy the file if the file's sensitivity level is very high;

2. Remove the file to a safe place if the sensitivity level is medium;

3. If the sensitivity level is low,

    a. Warn the local administrator or webmaster of the violation;
    b. Warn the recipient of the violation; or
    c. Send a message to the file's owner reporting the violation;

4. If the sensitivity level is very low, send a message to monitor 901 without disturbing the local host and local administrator.
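The pre-defined actions amount to a lookup from sensitivity level to response, which can be sketched as a table. The level names follow the list above; the enumeration, its numeric values, and the action strings are assumptions, and the text does not say how the four-bit sensitivity field maps onto these levels.

```python
from enum import Enum


class Sensitivity(Enum):
    # Levels named in the list above; the numeric values are arbitrary.
    VERY_LOW = 0
    LOW = 1
    MEDIUM = 2
    VERY_HIGH = 3


# Assumed short labels for the pre-defined actions.
PREDEFINED_ACTIONS = {
    Sensitivity.VERY_HIGH: "destroy the file",
    Sensitivity.MEDIUM: "remove the file to a safe place",
    Sensitivity.LOW: "warn the administrator, the recipient, or the owner",
    Sensitivity.VERY_LOW: "report to the monitor only",
}


def predefined_action(level: Sensitivity) -> str:
    """Return the pre-defined response for a violation at this level."""
    return PREDEFINED_ACTIONS[level]


print(predefined_action(Sensitivity.MEDIUM))
```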

Before going on to the next destination, watermark agent 925 may wait for a message from monitor 901 containing information about the next destination; the information may include:

The time of the last visit by an agent to the destination;

Information about the destination, for example detailed information about the digital representations to be examined there.

Nontraveling Watermark Agents

An important difference between a watermark agent and a watermark spider is that the watermark agent interacts with the document in the system where the document is being stored or processed, and can thus perform far more functions than a watermark spider can. A further consequence of this difference is that a watermark agent need not travel, but can simply be incorporated as a permanent component of a system. For example, a copier could include a watermark agent that read the watermarks of paper documents being copied and prevented the copier from copying a document when its watermark indicated that the document was not to be copied. An important application of such a non-traveling watermark agent would be to prevent the copying of digital paper cash.

Of course, if the copier had access to a network, even the “non-traveling” watermark agent could at least travel via the network to the copier, and the network would provide a convenient way of updating the copier's watermark agent. “Non-traveling” watermark agents could of course be distributed in a similar fashion to any system accessible via the network.

Security Considerations

In some cases, for example in private military or business networks or systems, agent 925 may not operate in a hostile environment, and monitor 901 and agent engine 1003 may even be implemented as integral parts of the operating system. In most cases, however, agent 925 will be operating in an environment which is hostile in at least four respects:

The node to which agent 925 sends itself is properly suspicious of messages from outside that contain code to be executed on the node;

to the extent that users on the node have violated the conditions under which they received a digital representation, they will want to hide their behavior and/or disable agent 925;

users on the node may want access to the keys and other data carried by agent 925; and

other users of network 103 may be interested in the content of the messages being exchanged between agent 925 and monitor 901.

The first of these problems is the “malicious agent problem”. It is general to systems that download and execute code, and the same solutions that are used in those cases can be applied to agent engine 1003 and agent 925. For example, if the watermark agent's code is written in Java, the system on which it is run will have whatever protections are provided by the Java interpreter. If managers of nodes are reasonably certain that agent engine 1003 and agents 925 will not do any damage to the nodes, they can be made to accept engine 1003 and agents 925 simply as a condition of downloading digital representations. For example, the transaction by which a digital representation manager downloads a digital representation to a node might include a message to agent engine 1003 confirming the existence and operability of agent engine 1003. If the message were not properly answered, the digital representation manager might require that the node download and install agent engine 1003 before proceeding further with the transaction.

The remainder of these problems are termed “malicious node problems”. They can be solved by standard cryptographic techniques, as described in Schneier, supra. For example, the digital representation manager and each agent engine 1003 might have a public key-private key pair. In that case, network information 909 would include the public key for the agent engine 1003 at a given node and the public keys for the agent engines 1003 in the nodes to be visited would be included in map 931. Any message sent by the digital representation manager or by an agent 925 to an agent engine 1003 can be encrypted using agent engine 1003's public key and any message sent by an agent engine 1003 or an agent 925 to a digital representation manager can be encrypted using the digital representation manager's public key. The public key for the digital representation manager can of course be included in agent 925's keys 934. Authentication of messages can be done using standard digital signature techniques; for example, agent data 929 might include a digital signature from the digital representation manager for agent 925, messages from the digital representation manager to agent engine 1003 can include the digital representation manager's digital signature, and messages from agent engine 1003 can include agent engine 1003's digital signature.

If the watermarks are made using encryption techniques, as described in E. Koch and J. Zhao, “Towards Robust and Hidden Image Copyright Labeling”, supra, the agent must have a way of decrypting the watermark. Depending on the situation, the watermark may be encrypted with the watermark agent's public key and authenticated with a digital signature in the same fashion as other messages to the agent engine or the watermark may have its own key 913. In the former case, the watermark agent's private key must be protected and in the latter, watermark key 913 must be protected, since access to the key would permit those intent on stealing digital representations to remove or alter the digital representation's watermark. While agent 925 is in transit, watermark key 913 can be protected by encryption in the same fashion as the rest of the information in agent 925; once agent 925 has been decrypted, watermark key 913 and agent engine 1003's private key must be protected in the node currently being visited by agent 925. Agent engine 1003's private key must further be protected to prevent a user of the node currently being visited by agent 925 from using the private key to decrypt messages addressed to agent engine 1003 or to append agent engine 1003's digital signature.

One way of solving these key protection problems is a secure coprocessor, as described in J. D. Tygar and Bennet Yee, Secure Coprocessors in Electronic Commerce Applications, FIRST USENIX WORKSHOP ON ELECTRONIC COMMERCE, JULY 1995. As shown at 1033, a secure coprocessor includes secure storage 1035 and a secure processor 1045. Secure storage 1035 may only be accessed via secure processor 1045, and secure coprocessor 1033 is built in such a fashion that any attempt to access the information in secure coprocessor 1033 other than via secure processor 1045 results in the destruction of the information. Secure coprocessor 1033 is able to write information to and read information from secure storage 1035 and also does encryption and decryption and makes and verifies digital signatures. These operations may be done entirely by executing code stored in secure storage 1035 or by means of a combination of code and specialized hardware devices, as shown at 1047 and 1049. The keys used in encryption, decryption, and in making digital signatures and verifying them are stored in secure storage 1035. Shown in FIG. 10 are WMkey 913 for the watermark, monitor public key 1039, agent engine public key 1041, and agent engine private key 1043. In the case of the public keys, storage in secure storage 1035 is simply a matter of convenience, and secure processor 1045 may provide access to the public keys in response to requests from components of node 1001; in the case of WMkey 913 and agent engine 1003's private key 1043, the decrypted keys 913 and 1043 are used only within secure coprocessor 1033.

In the context of system 1001, when a message encrypted with agent engine 1003's public key 1041 arrives in agent engine 1003, agent engine 1003 uses secure coprocessor 1033 to decrypt the message; if the message contains an agent 925, agent engine 1003 also uses secure coprocessor 1033 to verify that agent 925's digital signature is from the digital representation manager and to decrypt WMkey 913. The decrypted key is not returned to agent engine 1003, but is stored in secure storage 1035. SWM reader 1019 then uses secure coprocessor 1033 to decrypt the watermark in the digital representation currently being checked by agent 925.

Applications Using Watermark Agents

Since a watermark agent is a program, it can perform any action that can be programmed. The flexibility of watermark agents is increased when their use is combined with that of active watermarks. One set of applications for watermark agents is monitoring the use of copyrighted digital representations for the copyright owner or a licensing agency. A copyright owner or licensing agency, for example, may use watermark agents to locate unlicensed copies of digital representations or to periodically monitor the use of licensed copies. A document with an active watermark could increment a usage count maintained in agent engine 1003 for a node each time it was printed and agent 925 could read the count on its visit to the node, report the current count value back to management database 903, and reset the counter.
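
The usage-count example can be sketched in Python as follows. This is an illustration under assumptions: the names `AgentEngine`, `record_print`, `read_and_reset`, and `agent_visit` are hypothetical, and the management database is modeled as a plain dictionary keyed by node and document identifier.

```python
from dataclasses import dataclass, field


@dataclass
class AgentEngine:
    """Hypothetical stand-in for agent engine 1003 at one node."""
    usage_counts: dict = field(default_factory=dict)

    def record_print(self, doc_id: str) -> None:
        """Called by the active watermark's code each time the document is printed."""
        self.usage_counts[doc_id] = self.usage_counts.get(doc_id, 0) + 1

    def read_and_reset(self) -> dict:
        """Called by the visiting agent: hand over the counts and reset them."""
        counts, self.usage_counts = self.usage_counts, {}
        return counts


def agent_visit(engine: AgentEngine, management_db: dict, node: str) -> None:
    """Model one visit by agent 925: report the counts to the management database."""
    for doc_id, count in engine.read_and_reset().items():
        key = (node, doc_id)
        management_db[key] = management_db.get(key, 0) + count
```

Reading and resetting in a single step keeps a print that occurs between two visits from being counted twice or lost in this simplified, single-threaded model.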

Another set of applications is monitoring the use of digital representations to avoid liability for infringement. For example, a corporation might want to be sure that it has no unauthorized digital representations in its network and that the authorized ones are being used in accordance with their license terms. The agent can monitor the use of the digital representations in the corporate network in the same fashion as it does for the licensing agency. In this instance, the monitoring might even include destroying illegal copies.

Yet another set of applications is preventing unauthorized copying, scanning, or printing. This can be done by means of “nontraveling” watermark agents on servers and clients in the network or even by means of “nontraveling” watermark agents built into devices such as copiers, scanners, or printers. For example, if a “No copy” watermark is embedded in currency and a photocopier has an agent that looks for such a watermark and inhibits copying when it finds the watermark, the photocopier will not make copies of currency.
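
The photocopier example amounts to a check placed in front of the copy operation. The following Python sketch assumes a hypothetical function `guarded_copy` and a caller-supplied watermark reader `read_watermark`; how the watermark is actually detected in a scanned image is outside the sketch.

```python
from typing import Callable, Optional

NO_COPY = "No copy"  # watermark content that forbids copying


def guarded_copy(
    scan: bytes,
    read_watermark: Callable[[bytes], Optional[str]],
) -> Optional[bytes]:
    """Return a copy of the scan, or None when a "No copy" watermark is found."""
    if read_watermark(scan) == NO_COPY:
        return None  # the resident agent inhibits copying
    return bytes(scan)
```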

Watermark agents can also be used to enforce military or corporate document security rules. In such an application, the document's security classification would be embedded in it as a watermark and the watermark agent would search the military or corporate file systems and networks for documents that were not being dealt with as required by their security classification. Examples would be documents that were in the wrong place or had been kept longer than a predetermined period. Actions taken by the agent can range from reports and warnings through changing the access rights to the document or moving the document to a safe location to immediate destruction of the out-of-place document. Again, the agent that does this need not travel, but may simply be a permanent component of the file system.
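
A rule check of the kind just described might look like the following Python sketch. The classification names, directory prefixes, and retention periods in `RULES` are invented for illustration, and `check_document` is a hypothetical helper that only reports violations; choosing among the actions listed above (warning, changing access rights, moving, or destroying the document) is left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Rule:
    allowed_prefixes: tuple  # locations where the document may reside
    max_age: timedelta       # predetermined retention period


# Illustrative rules only; real classifications and limits would differ.
RULES = {
    "SECRET": Rule(("/vault/",), timedelta(days=30)),
    "INTERNAL": Rule(("/vault/", "/shared/"), timedelta(days=365)),
}


def check_document(classification: str, path: str,
                   created: datetime, now: datetime) -> list:
    """Return the violations for a document whose watermark gives its classification."""
    rule = RULES.get(classification)
    if rule is None:
        return []  # no rule for this classification
    violations = []
    if not path.startswith(rule.allowed_prefixes):
        violations.append("wrong location")
    if now - created > rule.max_age:
        violations.append("retention period exceeded")
    return violations
```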

Watermark agents, finally, can be used to find lost documents in military or business file systems or networks. If each document has a unique identifier associated with it and that identifier is on the one hand kept in a database and on the other hand incorporated into a watermark in the document, a watermark agent can simply be given the unique identifier and sent to search the file system or network for the document. Once the agent has found it, it can report its location to whoever sent the agent out.
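
The search for a lost document can be sketched as a scan that compares each document's watermark identifier with the one the agent was given. In this Python sketch the file systems are modeled as nested dictionaries, and `toy_read_watermark_id` is a purely hypothetical reader that treats the bytes between `WMID:` and `;` as the identifier; a real watermark reader would replace it.

```python
from typing import Callable, Dict, List, Optional, Tuple


def toy_read_watermark_id(content: bytes) -> Optional[str]:
    """Toy reader: the text between b'WMID:' and b';' is taken as the identifier."""
    marker = b"WMID:"
    start = content.find(marker)
    if start < 0:
        return None
    start += len(marker)
    end = content.find(b";", start)
    if end < 0:
        return None
    return content[start:end].decode("ascii", errors="replace")


def find_document(
    target_id: str,
    nodes: Dict[str, Dict[str, bytes]],
    read_id: Callable[[bytes], Optional[str]] = toy_read_watermark_id,
) -> List[Tuple[str, str]]:
    """Return (node, path) for every document whose watermark carries target_id."""
    hits = []
    for node, files in nodes.items():
        for path, content in files.items():
            if read_id(content) == target_id:
                hits.append((node, path))
    return hits
```

The returned list of locations corresponds to the report the agent sends back to its sender.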

CONCLUSION

The foregoing Detailed Description has disclosed to those skilled in the relevant arts how to make and use documents with authentication that withstands conversion between an analog form and a digital representation of the document, how to make and use digital representations with active watermarks, and how to make and use watermark agents, including mobile watermark agents, and has further disclosed the best mode presently known to the inventors for making such authentications, making active watermarks, and making watermark agents. The disclosed techniques are exceedingly general and may be implemented in many different ways for many different purposes. For example, the authentication techniques may be based on any kind of semantic information and there are many ways of deriving the authentication information from the semantic information, placing the authentication information in the digital representation or the analog form, and comparing the authentication information. Similarly, the program code for an active watermark may be written in any programming language, may be in source or object form, and may, when executed, perform arbitrary operations. Watermark agents, too, may perform arbitrary actions and employ various techniques for sending messages and traveling from node to node in a network. The watermark agents can of course perform authentication and can execute code from active watermarks.

Since the techniques are so general and may be implemented in any number of ways, the Detailed Description is to be regarded as being in all respects exemplary and not restrictive, and the breadth of the invention disclosed herein is to be determined not from the Detailed Description, but rather from the claims as interpreted with the full breadth permitted by the patent laws.

1. Apparatus for responding to watermarks in digital representations in a system that processes the digital representations containing the watermarks, the apparatus comprising: a watermark agent that includes a program which, when executed, reads a watermark in a digital representation and performs an action involving information contained in the watermark; and a watermark agent engine that executes the watermark agent, the digital representations whose watermarks are read by the watermark agent being neither fetched into the system by the watermark agent nor fetched into the system for the watermark agent.
 2. The apparatus for responding to watermarks set forth in claim 1 wherein: the system that processes the digital representations is a node in a network; and the system that processes the digital representations receives the watermark agent from another node in the network.
 3. The apparatus for responding to watermarks set forth in claim 1 wherein: the system that processes the digital representations is a node in a network; and the action performed by the program being executed is sending a message to another node in the network.
 4. The apparatus for responding to watermarks set forth in claim 1 wherein: the action performed by the program being executed is sending a message to the system that processes the digital representations.
 5. The apparatus for responding to watermarks set forth in claim 1 wherein: the system that processes the digital representations responds to the message by altering the manner in which it processes the digital representation whose watermark was read by the watermark agent. 